@JohannesGaessler
Collaborator

I noticed that on my machine with an Epyc 7742, test-backend-ops was disproportionately slow. The problem seems to be that for a lot of tests the tensors are so small that thread management dominates the runtime. This PR scales the number of threads based on the number of tensor elements, which makes the tests quite a bit faster. With my Ryzen 5950X the runtime decreases from 63s to 26s; on my Epyc 7742 it decreases from 519s to 55s.

Right now the minimum number of threads is 1; should we increase this to 2 for ggml graph evaluations to ensure that multithreading is also tested?
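A minimal sketch of the scaling idea described above, not the code from this PR: the helper name `n_threads_for_sketch`, the `ELEMENTS_PER_THREAD_SKETCH` constant, and its value are all assumptions made for illustration. It picks roughly one thread per fixed chunk of elements and clamps the result between a configurable minimum and the hardware limit, so the 1-versus-2 minimum asked about above becomes a single parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>

// Hypothetical tuning constant: how many elements justify one more thread.
// The real threshold would have to be measured; this value is a placeholder.
constexpr int64_t ELEMENTS_PER_THREAD_SKETCH = 128 * 1024;

// hardware_concurrency() may return 0 when the value is not computable,
// so fall back to a single thread in that case.
static size_t hardware_threads_sketch() {
    const unsigned hc = std::thread::hardware_concurrency();
    return hc == 0 ? 1 : static_cast<size_t>(hc);
}

// Scale the thread count with the amount of work, clamped to
// [min_threads, max_threads]. If max_threads is below min_threads,
// the minimum wins so that the requested floor is always honored.
static size_t n_threads_for_sketch(int64_t n_elements, size_t max_threads, size_t min_threads = 1) {
    assert(min_threads >= 1);
    max_threads = std::max(max_threads, min_threads);

    // ceil(n_elements / ELEMENTS_PER_THREAD_SKETCH), never less than 1
    const int64_t chunks = (n_elements + ELEMENTS_PER_THREAD_SKETCH - 1) / ELEMENTS_PER_THREAD_SKETCH;
    const size_t wanted  = chunks < 1 ? 1 : static_cast<size_t>(chunks);

    return std::min(std::max(wanted, min_threads), max_threads);
}
```

A caller would pass `hardware_threads_sketch()` as `max_threads`; passing `min_threads = 2` would keep multithreaded code paths exercised even for tiny tensors, at the cost of some thread overhead.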

@github-actions bot added the testing (Everything test related) label on Mar 6, 2025
@jeffbolznv
Collaborator

I did not see a perf increase from this change, maybe a slight slowdown (41s -> 43s). Tested on an i9-14900k using the Vulkan backend.

For me, about half the runtime in test-backend-ops is spent setting up the lookup tables for the IQ formats for the CPU backend. It would be nice to have these precomputed.

  {
      // parallel initialization
-     static const size_t n_threads = std::thread::hardware_concurrency();
+     static const size_t n_threads = get_n_threads(ggml_nelements(tensor));
Member

@slaren Mar 8, 2025

I don't think that it makes sense to initialize this from the current tensor, since this is a one-time RNG initialization. It should be done with the maximum number of threads that may be used.

To clarify: this variable is also used later to determine the number of threads to use to initialize the tensor, but it is initialized here as a static and used to initialize the static RNGs. It is necessary to initialize as many RNGs as threads may be used, but you can use a different number of threads for each tensor.
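The separation described here can be sketched as follows. This is an illustration under stated assumptions, not the actual test-backend-ops code: the names `max_threads_sketch`, `rng_pool_sketch`, and `fill_uniform_sketch` are hypothetical, and a plain `std::vector<float>` stands in for a ggml tensor. The generator pool is a function-local static sized once for the maximum thread count, while each fill call chooses its own, possibly smaller, number of threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <thread>
#include <vector>

// Upper bound on the threads any single fill may use (assumed to be the
// hardware limit; hardware_concurrency() may return 0, hence the fallback).
static size_t max_threads_sketch() {
    const unsigned hc = std::thread::hardware_concurrency();
    return hc == 0 ? 1 : static_cast<size_t>(hc);
}

// One generator per thread that may ever be used, created exactly once.
// Sizing this from the first tensor seen would leave too few generators
// for a later, larger tensor.
static std::vector<std::default_random_engine> & rng_pool_sketch() {
    static std::vector<std::default_random_engine> pool = [] {
        std::random_device rd;
        std::vector<std::default_random_engine> p;
        const size_t n = max_threads_sketch();
        p.reserve(n);
        for (size_t i = 0; i < n; i++) {
            p.emplace_back(rd());
        }
        return p;
    }();
    return pool;
}

// Fill `data` with uniform random values using a per-call thread count.
// Thread t only ever touches generator t and its own slice of `data`.
static void fill_uniform_sketch(std::vector<float> & data, size_t n_threads, float lo = -1.0f, float hi = 1.0f) {
    auto & pool = rng_pool_sketch();
    n_threads = std::max<size_t>(1, std::min(n_threads, pool.size()));
    assert(n_threads <= pool.size()); // never more threads than generators

    const size_t n = data.size();
    std::vector<std::thread> workers;
    workers.reserve(n_threads);
    for (size_t t = 0; t < n_threads; t++) {
        const size_t begin = n * t / n_threads;
        const size_t end   = n * (t + 1) / n_threads;
        workers.emplace_back([&data, &pool, t, begin, end, lo, hi] {
            std::uniform_real_distribution<float> dist(lo, hi);
            for (size_t i = begin; i < end; i++) {
                data[i] = dist(pool[t]);
            }
        });
    }
    for (auto & w : workers) {
        w.join();
    }
}
```

With this split, a small tensor can be filled with one thread and a large one with many, and both draw from the same one-time pool.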
